LOWER CONFIDENCE LIMIT EXPRESSIONS FOR P(X>y) AND P(X>Y)
UNDER NORMALITY
W.  M.  Woods 

Department  of  Operations  Research 
Naval  Postgraduate  School 
Monterey,  CA  93943 

Commander  Wen-Hui  Yang 
Taiwan  Navy 


Closed form expressions of approximate lower confidence limits for
P(X>y) and P(X>Y) are presented where X and Y are independent and
have normal probability distributions with unknown means and
variances. Each of the three expressions requires values of percentile
points from the standard t distribution and values of the standard
normal cumulative distribution function to compute the lower
confidence limit using a data set. The expressions are shown to be quite
accurate for sample sizes of ten or larger.

KEY  WORDS:  Mechanical  reliability;  interval  estimates. 


1.  INTRODUCTION 

Throughout this paper X and Y are independent normally distributed
variables with unknown means $\mu_x$ and $\mu_y$ and unknown variances
$\sigma_x^2$ and $\sigma_y^2$. $\Phi(z)$ is the standard normal CDF, and
$t_{\alpha,n}$ is the $100(1-\alpha)$th percentile point of the standard t
distribution with n degrees of freedom. The symbol $t_n(\rho)$ denotes a
noncentral t variable with n degrees of freedom and noncentrality
parameter $\rho$. The terms $\bar{X}$ and $S_x^2$ are the sample mean and
sample variance of a random sample of size n on X. The terms $\bar{Y}$ and
$S_y^2$ are the sample mean and sample variance of a random sample of size
m on Y. The pooled variance is
$S_p^2 = [(n-1)S_x^2 + (m-1)S_y^2]/(n+m-2)$, and $\bar{x}$, $\bar{y}$, $s_x$,
$s_y$, $s_p$ denote observed values of the corresponding random variables.

If the random strength, X, of a device must exceed some worst case stress
value y, the device reliability is often modeled as P(X>y). If the stress
variable, Y, is also random, the device reliability is frequently modeled as
P(X>Y), which is the average of P(X>y) with respect to the distribution of Y
when X and Y are independent. In this paper we present closed form
expressions for approximate lower confidence limits for P(X>y) and P(X>Y).
In the latter case one expression is provided assuming
$\sigma_x = \sigma_y$ and a different expression is given assuming
$\sigma_x \ne \sigma_y$.
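To make the two reliability models concrete, the following minimal sketch evaluates them when the parameters are treated as known. It uses the relations $P(X>y) = \Phi((\mu_x - y)/\sigma_x)$ and $P(X>Y) = \Phi((\mu_x-\mu_y)/(\sigma_x^2+\sigma_y^2)^{1/2})$ stated in Section 2. The function names and the numerical inputs are illustrative assumptions, not part of the paper.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def reliability_fixed_stress(mu_x: float, sigma_x: float, y: float) -> float:
    """P(X > y) for normal strength X and a fixed worst-case stress y."""
    return PHI((mu_x - y) / sigma_x)


def reliability_random_stress(mu_x: float, sigma_x: float,
                              mu_y: float, sigma_y: float) -> float:
    """P(X > Y) for independent normal strength X and stress Y."""
    return PHI((mu_x - mu_y) / math.hypot(sigma_x, sigma_y))


# Hypothetical parameter values, chosen only for illustration.
print(reliability_fixed_stress(mu_x=10.0, sigma_x=2.0, y=6.0))
print(reliability_random_stress(mu_x=10.0, sigma_x=3.0, mu_y=5.0, sigma_y=4.0))
```

In practice the means and variances are unknown, which is why the rest of the paper works with lower confidence limits rather than these point values.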

Several papers in the literature present the minimum variance unbiased
estimator for P(X>y) when $\mu_x$ and $\sigma_x$ are unknown. See, for example,
Lieberman and Resnikoff (5), Folks and others (4), and Barton (1). Tables of
exact interval estimates for P(X>y) were developed by Owen and Hua (6) for
confidence levels of 90% and 95%. The exact procedure requires a search in
tables of the noncentral t distribution. Owen and Hua have performed this
task and developed more convenient tables. One enters their tables using
confidence level $1-\alpha$, sample size n and sample statistic value
$(\bar{x}-y)/s = k$ to obtain the lower confidence limit for P(X>y). The lower
limit values obtained using our expression agree with their values to the
nearest hundredth or better for sample sizes of ten or larger and
$1 \le k \le 5$. Access to values of $t_{\alpha,n}$ and $\Phi(z)$ is required to
use our closed form procedure, which is also a function of $\alpha$, n and k.

If $\sigma_x = \sigma_y$ in the two variable case, our lower confidence limit
expression for P(X>Y) is analogous to the single variate case. Values of
$t_{\alpha,n+m-2}$ and $\Phi(z)$ are needed to use this procedure, which is a
function of the sample sizes n and m and of $(\bar{x}-\bar{y})/s_p$.

Approximate confidence limits for P(X>Y) have been developed by
Church  and  Harris  (2)  when  Y  has  a  standard  normal  distribution.  Downton 
(3)  modified  their  procedure  slightly  to  obtain  more  accurate  limits  and 
suggests  an  approximate  procedure  when  the  means  and  variances  of  both  X 
and  Y  are  unknown. 


2.  SUMMARY  OF  RESULTS 

For the single variable case we seek a lower $100(1-\alpha)\%$ confidence limit
for $P(X>y) = R(y)$. Let $\delta = (\mu_x - y)/\sigma_x$; then
$R(y) = \Phi(\delta)$ and a lower confidence limit $R(y)_L$ for $R(y)$ is given by

$$R(y)_L = \Phi(\delta_L) \tag{1}$$

where $\delta_L$ is a lower confidence limit for $\delta$.

Owen and Hua (6) used the general confidence interval procedure to find
$\delta_L$. Specifically $\delta_L$ is the value of $\delta$ for which

$$1-\alpha = P\left[(\bar{X}-y)/S < (\bar{x}-y)/s\right]
          = P\left[t_{n-1}\left((\mu_x - y)\sqrt{n}/\sigma_x\right) < \sqrt{n}\,(\bar{x}-y)/s\right] \tag{2}$$

Letting $k = (\bar{x}-y)/s$, Owen and Hua develop tables of $R(y)_L$ for many
values of k in [-3, 6], sample sizes n = 2(1)18, 21(3)30, 40(20)100, and
$1-\alpha$ = .90 and .95. The corresponding values of $\delta_L$ can easily be
obtained by inverting Equation (1).

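In Section 3 of this paper an approximate lower confidence limit,
$\delta_{L\alpha}$, for $\delta$ is shown to be

$$\delta_{L\alpha} = k - \left((1/n) + k^2/(2(n-1))\right)^{1/2} t_{\alpha,n-1} \tag{3}$$

Values of $\delta_{L\alpha}$ were compared with corresponding values of
$\delta_L$ from Owen and Hua's tables for numerous sets of (n,k) to obtain a
more accurate expression, $\delta^*_{L\alpha}$. Let

$$N(\alpha,n) = \begin{cases}
60 & \text{if } \alpha = .05,\ 10 \le n \le 60 \\
n  & \text{if } \alpha = .05,\ n > 60 \\
n  & \text{if } \alpha = .10,\ n \ge 10
\end{cases} \tag{4}$$

$$\delta^*_{L\alpha} = k - \left((1/n) + k^2/(2(n-1))\right)^{1/2} t_{\alpha,N(\alpha,n)} \tag{5}$$

The corresponding approximate $100(1-\alpha)\%$ lower confidence limit
$R^*(y)_L$ is

$$R^*(y)_L = \Phi(\delta^*_{L\alpha}). \tag{6}$$

Table 1 displays values of $R^*(y)_L$ and those given by Owen and Hua, $R_L$.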
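Equations (4) to (6) can be sketched in a few lines of code. This is a minimal illustration, not the authors' program: the function names are hypothetical, and the t percentile is passed in by the caller (read from a t table, as the paper assumes) rather than computed. The values 1.6706 and 1.3722 used below are the standard tabled upper 5% point for 60 degrees of freedom and upper 10% point for 10 degrees of freedom.

```python
import math
from statistics import NormalDist


def effective_n(alpha: float, n: int) -> int:
    """N(alpha, n) of Equation (4); only the tabulated cases are defined."""
    if n >= 10 and math.isclose(alpha, 0.05):
        return 60 if n <= 60 else n
    if n >= 10 and math.isclose(alpha, 0.10):
        return n
    raise ValueError("N(alpha, n) is given only for alpha = .05 or .10 and n >= 10")


def single_variable_limit(k: float, n: int, t_value: float) -> float:
    """Approximate lower confidence limit for P(X > y).

    k       : observed (xbar - y) / s
    n       : sample size
    t_value : t percentile supplied by the caller (assumed to be the one
              called for in Equation (5))
    """
    spread = math.sqrt(1.0 / n + k * k / (2.0 * (n - 1)))
    return NormalDist().cdf(k - spread * t_value)


# n = 10, k = 1, alpha = .05: Equation (4) gives N = 60.
print(effective_n(0.05, 10), single_variable_limit(1.0, 10, 1.6706))
# n = 10, k = 2, alpha = .10: Equation (4) gives N = 10.
print(effective_n(0.10, 10), single_variable_limit(2.0, 10, 1.3722))
```

The two printed limits can be compared with the corresponding $R^*_L$ entries of Table 1 for n = 10.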

For the two variate case, we seek a lower $100(1-\alpha)\%$ confidence limit for
P(X>Y). We first assume $\sigma_x = \sigma_y = \sigma$. Let $R = P(X>Y)$ and
$d = (\mu_x - \mu_y)/\sigma$; then $R = \Phi(d/\sqrt{2})$ and a lower confidence
limit, $R_L$, for R is

$$R_L = \Phi(d_L/\sqrt{2}) \tag{7}$$

It is shown in Section 3 that a lower $100(1-\alpha)\%$ confidence limit, $d_L$,
for d is the value of d for which

$$1-\alpha = P\left[t_{n+m-2}\left(\frac{\mu_x-\mu_y}{\sigma\sqrt{(1/n)+(1/m)}}\right)
  < \frac{\bar{x}-\bar{y}}{s_p\sqrt{(1/n)+(1/m)}}\right] \tag{8}$$
TABLE 1. APPROXIMATE (R*_L) AND EXACT (R_L) CONFIDENCE LIMITS FOR P(X>y)

                 α = .10               α = .05
  n    k      R*_L      R_L         R*_L      R_L

 10    1     .6768     .6816       .6334     .6305
       2     .8890     .8913       .8535     .8519
       3     .9736     .9745       .9560     .9556
       4     .9958     .9960       .9903     .9903
       5     .9996     .9996       .9985     .9985

 18    1     .7298     .7303       .6460     .6950
       2     .9260     .9257       .9040     .9035
       3     .9877     .9875       .9800     .9800
       4     .9988     .9988       .9973     .9973
       5     .9999     .9999       .9998     .9998

 30    1     .7597     .7594       .7337     .7336
       2     .9431     .9426       .9286     .9285
       3     .9925     .9923       .9885     .9885
       4     .9995     .9994       .9989     .9989
       5     .99998    .99998      .99994    .99994

 60    1     .7865     .7861       .7688     .7691
       2     .9562     .9559       .9478     .9480
       3     .9954     .9953       .9936     .9937
       4     .9998     .9998       .9996     .9996
       5     .99999    .99999      .99999    .99999


Alternatively, an approximate lower confidence limit using a Taylor
series expansion of $(\bar{x}-\bar{y})/S_p$ in the manner described in Section 3 is

$$d_{L\alpha} = K - \left((1/n) + (1/m) + K^2/(2(n+m-2))\right)^{1/2} t_{\alpha,n+m-2} \tag{9}$$

where $K = (\bar{x}-\bar{y})/s_p$. Yang (8) used Monte Carlo simulations to
evaluate the accuracy of $R_{L\alpha} = \Phi(d_{L\alpha}/\sqrt{2})$ for
$\alpha$ = .20, .10, and .05. One thousand replications of $R_{L\alpha}$ were
generated for each parameter set ($\sigma$, m, n). Values of 1 and 20 were
chosen for $\sigma$, then values of $\mu_x$ and $\mu_y$ were selected so that
R = .90, .95 and .99. For each of these six parameter sets, three pairs of
sample sizes were chosen, for a total of eighteen parameter sets for each
value of $\alpha$. The results of the simulations are given in Table 2.

TABLE 2. ANALYSIS OF APPROXIMATE CONFIDENCE LIMITS FOR
P(X>Y): EQUAL VARIANCES CASE

Each entry is the pair R_{Lα,1000(1-α)}, p.

   R      n     m       α = .20          α = .10          α = .05

 .900     8     8     .8989, .801      .8944, .906      .8884, .960
          8    30     .9003, .798      .9050, .888      .8987, .952
         20    30     .9019, .788      .9034, .888      .9015, .947

 .950     8     8     .9498, .801      .9467, .909      .9406, .962
          8    30     .9511, .7?1      .9529, .886      .9489, .951
         20    30     .9516, .785      .9517, .889      .9500, .950

 .990     8     8     .9901, .797      .9886, .911      .9865, .962
          8    30     .9906, .783      .9909, .881      .9900, .950
         20    30     .9904, .789      .9904, .895      .9906, .945

If $R_{L\alpha}$ is an exact procedure, the values in the column labeled
$R_{L\alpha,1000(1-\alpha)}$ equal the values of R in the same row. The symbol p
denotes the proportion of the 1000 lower confidence limits that covered the
value of R. The results were the same for $\sigma = 1$ and $\sigma = 20$.
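The equal variances limit of Equations (7) and (9) can be sketched as follows. This is an illustrative outline under stated assumptions: the function names are hypothetical, the inputs are summary statistics, and the t percentile $t_{\alpha,n+m-2}$ is supplied by the caller from a t table (1.345 is the standard tabled upper 10% point for 14 degrees of freedom).

```python
import math
from statistics import NormalDist, variance


def pooled_sd(x, y) -> float:
    """Pooled standard deviation s_p from two raw samples."""
    n, m = len(x), len(y)
    sp2 = ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)
    return math.sqrt(sp2)


def equal_variance_limit(xbar: float, ybar: float, sp: float,
                         n: int, m: int, t_value: float) -> float:
    """Approximate lower confidence limit for P(X > Y), equal variances.

    t_value is the caller-supplied percentile t_{alpha, n+m-2}.
    """
    K = (xbar - ybar) / sp
    spread = math.sqrt(1.0 / n + 1.0 / m + K * K / (2.0 * (n + m - 2)))
    d_lower = K - spread * t_value           # Equation (9)
    return NormalDist().cdf(d_lower / math.sqrt(2.0))  # Equation (7)


# Hypothetical summary statistics: K = 2 with n = m = 8 and alpha = .10.
limit = equal_variance_limit(xbar=2.0, ybar=0.0, sp=1.0, n=8, m=8, t_value=1.345)
print(limit)
```

The limit is necessarily below the point estimate $\Phi(K/\sqrt{2})$, since the subtracted term in Equation (9) is positive.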

If we assume $\sigma_x \ne \sigma_y$, let $R = P(X>Y)$ and
$r = (\mu_x-\mu_y)/(\sigma_x^2+\sigma_y^2)^{1/2}$; then $R = \Phi(r)$ and a lower
$100(1-\alpha)\%$ confidence limit $R_L$ for R is

$$R_L = \Phi(r_L) \tag{10}$$

where $r_L$ is a lower confidence limit for r. Let $\hat{r}$ be defined by

$$\hat{r} = (\bar{x}-\bar{y})/(s_x^2+s_y^2)^{1/2} \tag{11}$$

The general method cannot be used to find a lower confidence limit for r,
because the statistic $\hat{r}$ cannot be modified to an equivalent statistic
whose distribution is known with r as the only unknown parameter. This is the
same difficulty encountered in the well-known Behrens-Fisher problem
associated with finding confidence intervals for $\mu_x-\mu_y$. However, an
approximate lower confidence limit for r can be found by using a Taylor
series expansion of $\hat{r}$ and fitting a t distribution to the distribution
of $\hat{r}$ using random degrees of freedom, $\nu$, computed from the data.
This procedure is developed in Section 3. The expression for $r_{L\alpha}$ is

$$r_{L\alpha} = \hat{r} - \left[\frac{s_x^2/n + s_y^2/m}{s_x^2+s_y^2}
  + \frac{\hat{r}^2\left(s_x^4/(n-1) + s_y^4/(m-1)\right)}{2(s_x^2+s_y^2)^2}\right]^{1/2} t_{\alpha,\nu} \tag{12}$$

where

$$\nu = (s_x^2+s_y^2)^2 / \left(s_x^4/(n-1) + s_y^4/(m-1)\right) \tag{13}$$
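Equations (10) to (13) translate directly into code. The sketch below is a hedged illustration: the function names and the sample statistics are hypothetical, and because $\nu$ is generally not an integer, the t percentile is again left to the caller (by rounding or interpolating in a t table). In the example, 1.706 is the tabled upper 5% point for 26 degrees of freedom, used as a stand-in for the computed $\nu$ of about 26.2.

```python
import math
from statistics import NormalDist


def approx_df(sx2: float, sy2: float, n: int, m: int) -> float:
    """Degrees of freedom nu of Equation (13) from the sample variances."""
    return (sx2 + sy2) ** 2 / (sx2 ** 2 / (n - 1) + sy2 ** 2 / (m - 1))


def unequal_variance_limit(xbar: float, ybar: float, sx2: float, sy2: float,
                           n: int, m: int, t_value: float) -> float:
    """Approximate lower confidence limit for P(X > Y), unequal variances.

    t_value is the caller-supplied percentile t_{alpha, nu}.
    """
    total = sx2 + sy2
    r_hat = (xbar - ybar) / math.sqrt(total)                      # Equation (11)
    var_hat = ((sx2 / n + sy2 / m) / total
               + r_hat ** 2 * (sx2 ** 2 / (n - 1) + sy2 ** 2 / (m - 1))
               / (2.0 * total ** 2))
    r_lower = r_hat - math.sqrt(var_hat) * t_value                # Equation (12)
    return NormalDist().cdf(r_lower)                              # Equation (10)


# Hypothetical data summary: s_x^2 = 1, s_y^2 = 4, n = 10, m = 20, alpha = .05.
nu = approx_df(1.0, 4.0, 10, 20)
limit = unequal_variance_limit(4.0, 0.0, 1.0, 4.0, 10, 20, t_value=1.706)
print(nu, limit)
```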


The method used in this procedure to resolve the Behrens-Fisher type of
difficulty is similar to that employed by Welch (7). The expression for $\nu$ in
Equation (13) is different from that used by Welch.

Yang (8) has used Monte Carlo simulations to evaluate the accuracy of
this procedure given by Equations (10), (11), (12) and (13). The results of his
simulations are given in Table 3. The symbol p denotes the proportion of the
1000 randomly generated lower confidence limits that covered the value of R.


TABLE 3. ANALYSIS OF APPROXIMATE CONFIDENCE LIMITS FOR
P(X>Y) FOR UNEQUAL VARIANCES CASE

Each entry is the pair R_{Lα,1000(1-α)}, p.

   R      σ_x     σ_y     n     m       α = .20          α = .10          α = .05

 .900     1.0     2.0    10    20     .9008, .796      .9008, .897      .8959, .956
                         25    35     .8996, .803      .8993, .901      .8988, .963
                         75    50     .8997, .801      .8990, .907      .8998, .950
         10.0    40.0    10    20     .9019, .791      .9022, .889      .9009, .947
                         25    35     .9012, .790      .9005, .897      .8980, .954
                         75    50     .9000, .800      .8995, .902      .8992, .952

 .950     1.0     2.0    10    20     .9500, .800      .9497, .901      .9461, .955
                         25    35     .9502, .798      .9502, .899      .9482, .959
                         75    50     .9496, .808      .9493, .903      .9500, .948
         10.0    40.0    10    20     .9513, .784      .9510, .896      .9517, .945
                         25    35     .9514, .782      .9494, .902      .9470, .955
                         75    50     .9487, .810      .9494, .904      .9500, .950

 .990     1.0     2.0    10    20     .9902, .793      .9899, .906      .9888, .955
                         25    35     .9903, .789      .9899, .901      .9891, .961
                         75    50     .9898, .811      .9899, .905      .9899, .952
         10.0    40.0    10    20     .9906, .777      .9901, .898      .9903, .942
                         25    35     .9907, .775      .9904, .893      .9892, .955
                         75    50     .9899, .803      .9898, .906      .9900, .949
3.  ANALYSIS


For the single variate case we fit a t distribution to the distribution of the
statistic $(\bar{X}-y)/S_x$. If $g(\bar{x},s) = (\bar{x}-y)/s_x = k$ is expanded in
a Taylor series about $(\mu,\sigma)$ and the subscript x is dropped, then

$$g(\bar{x},s) \approx (\mu-y)/\sigma + (\bar{x}-\mu)/\sigma - (s-\sigma)(\mu-y)/\sigma^2 .$$

Then

$$E(k) = E[g(\bar{x},s)] = (\mu-y)\left(4 - 3\sqrt{1 - 1/(2n-1)}\right)/\sigma
      \approx (\mu-y)/\sigma \quad \text{if } n > 8, \tag{14}$$

and

$$\sigma_k^2 = \mathrm{var}[g(\bar{x},s)] = \mathrm{var}(\bar{x})/\sigma^2 + \left((\mu-y)/\sigma^2\right)^2 \mathrm{var}(s)
           = (1/n) + \left((\mu-y)/\sigma\right)^2/(2(n-1)) \tag{15}$$

Let $\hat{\sigma}_k^2 = (1/n) + k^2/(2(n-1))$. It is easily shown that k is a
consistent estimator for $(\mu-y)/\sigma = \delta$ and for large n the
distribution of k will be approximately normal. Consequently an approximate
lower confidence limit for $(\mu-y)/\sigma$ would be $k - \hat{\sigma}_k z_\alpha$.
We choose instead to approximate the distribution of k with a central t
distribution with n - 1 degrees of freedom and thus obtain the approximate
lower $100(1-\alpha)\%$ confidence limit, $\delta_{L\alpha}$, for $\delta$ as

$$\delta_{L\alpha} = k - \left((1/n) + k^2/(2(n-1))\right)^{1/2} t_{\alpha,n-1} \tag{16}$$
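The second term of the variance expression for k relies on the approximation $\mathrm{var}(s) \approx \sigma^2/(2(n-1))$. The text does not spell this step out; one standard way to obtain it, shown here as a sketch, is a first-order (delta method) expansion of $s = (s^2)^{1/2}$ about $\sigma^2$, using the fact that $(n-1)s^2/\sigma^2$ has a chi-square distribution with n - 1 degrees of freedom.

```latex
\begin{align*}
\mathrm{var}(s^2) &= \frac{2\sigma^4}{n-1}, \\
\mathrm{var}(s) &\approx \left(\frac{d\sqrt{u}}{du}\bigg|_{u=\sigma^2}\right)^{2} \mathrm{var}(s^2)
                 = \frac{1}{4\sigma^2}\cdot\frac{2\sigma^4}{n-1}
                 = \frac{\sigma^2}{2(n-1)}, \\
\left(\frac{\mu-y}{\sigma^2}\right)^{2} \mathrm{var}(s)
                &\approx \frac{\left((\mu-y)/\sigma\right)^2}{2(n-1)} .
\end{align*}
```

Together with $\mathrm{var}(\bar{x})/\sigma^2 = 1/n$, this gives the two terms $(1/n) + ((\mu-y)/\sigma)^2/(2(n-1))$.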
As discussed in Section 2, the tables developed by Owen and Hua can be used
to modify the right member of Equation (16) to obtain the more accurate
lower confidence limit $\delta^*_{L\alpha}$ given by

$$\delta^*_{L\alpha} = k - \left((1/n) + k^2/(2(n-1))\right)^{1/2} t_{\alpha,N(\alpha,n)} \tag{17}$$

where $N(\alpha,n)$ is defined in Equation (4).

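In the two variable case when $\sigma_x = \sigma_y$, we fit a t distribution to
the distribution of the statistic $(\bar{X}-\bar{Y})/S_p$. Expanding
$g(\bar{x}-\bar{y}, s_p) = (\bar{x}-\bar{y})/s_p = K$ in a Taylor series about
$(\mu_x-\mu_y, \sigma)$ and collecting terms we get

$$K \approx (\mu_x-\mu_y)/\sigma + [\bar{x}-\bar{y}-(\mu_x-\mu_y)]/\sigma - (s_p-\sigma)(\mu_x-\mu_y)/\sigma^2 .$$

Then

$$E(K) \approx (\mu_x-\mu_y)/\sigma \tag{18}$$

$$\mathrm{var}(K) \approx (1/n) + (1/m) + (\mu_x-\mu_y)^2/(2\sigma^2(n+m-2)) \tag{19}$$

Let

$$\hat{\sigma}_K = \left((1/n) + (1/m) + K^2/(2(n+m-2))\right)^{1/2} \tag{20}$$

Proceeding in a manner similar to the single variate case we obtain the
approximate lower confidence limit, $d_{L\alpha}$, for
$d = (\mu_x-\mu_y)/\sigma$ given by

$$d_{L\alpha} = K - \left((1/n) + (1/m) + K^2/(2(n+m-2))\right)^{1/2} t_{\alpha,n+m-2} \tag{21}$$

and

$$R_{L\alpha} = \Phi(d_{L\alpha}/\sqrt{2}) \tag{22}$$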
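The coverage of the equal variances limit can be probed by simulation along the lines of the Monte Carlo study summarized in Section 2. The sketch below is not Yang's program: the function names, seed, and replication count are assumptions, and the t percentile (1.345, the tabled upper 10% point for 14 degrees of freedom) is hard-coded for the case n = m = 8, $\alpha$ = .10. It estimates the proportion p of simulated lower limits that fall at or below the true R.

```python
import math
import random
from statistics import NormalDist, mean, variance

STD_NORMAL = NormalDist()


def equal_variance_limit(x, y, t_value: float) -> float:
    """Approximate lower limit for P(X > Y) from raw samples, equal variances."""
    n, m = len(x), len(y)
    sp = math.sqrt(((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2))
    K = (mean(x) - mean(y)) / sp
    spread = math.sqrt(1.0 / n + 1.0 / m + K * K / (2.0 * (n + m - 2)))
    return STD_NORMAL.cdf((K - spread * t_value) / math.sqrt(2.0))


def coverage(R: float, n: int, m: int, t_value: float,
             sigma: float = 1.0, reps: int = 2000, seed: int = 1) -> float:
    """Proportion of simulated lower limits that do not exceed the true R."""
    rng = random.Random(seed)
    d = math.sqrt(2.0) * STD_NORMAL.inv_cdf(R)  # true (mu_x - mu_y) / sigma
    mu_x, mu_y = d * sigma, 0.0
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(mu_x, sigma) for _ in range(n)]
        y = [rng.gauss(mu_y, sigma) for _ in range(m)]
        if equal_variance_limit(x, y, t_value) <= R:
            hits += 1
    return hits / reps


p = coverage(R=0.90, n=8, m=8, t_value=1.345)
print(p)
```

A value of p near the nominal $1-\alpha$ = .90 is what an accurate procedure should produce; the estimate carries simulation error of roughly plus or minus .01 to .02 at this replication count.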
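When $\sigma_x \ne \sigma_y$ we fit a t distribution to the distribution of the
statistic $\hat{r} = (\bar{X}-\bar{Y})/(S_x^2+S_y^2)^{1/2}$. Expanding $\hat{r}$
in a Taylor series about $(\mu_x-\mu_y,\ \sigma_x^2+\sigma_y^2)$ and collecting
terms we get

$$\hat{r} \approx r + \frac{\bar{x}-\bar{y}-(\mu_x-\mu_y)}{(\sigma_x^2+\sigma_y^2)^{1/2}}
  - \frac{(\mu_x-\mu_y)\left[s_x^2+s_y^2-(\sigma_x^2+\sigma_y^2)\right]}{2(\sigma_x^2+\sigma_y^2)^{3/2}} \tag{23}$$

Consequently

$$\mathrm{var}(\hat{r}) \approx \frac{\mathrm{var}(\bar{x}-\bar{y})}{\sigma_x^2+\sigma_y^2}
  + \frac{(\mu_x-\mu_y)^2\,\mathrm{var}(s_x^2+s_y^2)}{4(\sigma_x^2+\sigma_y^2)^3}
  = \frac{\sigma_x^2/n+\sigma_y^2/m}{\sigma_x^2+\sigma_y^2}
  + \frac{(\mu_x-\mu_y)^2\left(\sigma_x^4/(n-1)+\sigma_y^4/(m-1)\right)}{2(\sigma_x^2+\sigma_y^2)^3} \tag{24}$$

Let

$$\hat{\sigma}_{\hat{r}} = \left[\frac{s_x^2/n+s_y^2/m}{s_x^2+s_y^2}
  + \frac{(\bar{x}-\bar{y})^2\left(s_x^4/(n-1)+s_y^4/(m-1)\right)}{2(s_x^2+s_y^2)^3}\right]^{1/2} \tag{25}$$

An approximate $100(1-\alpha)\%$ lower confidence limit, $r_{L\alpha}$, for r is

$$r_{L\alpha} = \hat{r} - \hat{\sigma}_{\hat{r}}\, t_{\alpha,\nu} \tag{26}$$

where $\nu$ is some appropriate degrees of freedom. Then

$$R_{L\alpha} = \Phi(r_{L\alpha}) \tag{27}$$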

The following method for finding an "appropriate" degrees of freedom is
used only to develop an expression for the degrees of freedom, $\nu$, that makes
Equation (26) sufficiently accurate. If a strategy similar to that used in the
two previous interval methods is used to find an approximate confidence limit
for r, we need to find the approximate degrees of freedom for an equivalent
expression for $\hat{r}$ which looks something like a noncentral t distribution.
Yang [9] used a different approach to the following procedure, but obtained the
same result. The expression

$$\hat{r} = \frac{(\bar{X}-\bar{Y})/(\sigma_x^2+\sigma_y^2)^{1/2}}
               {\left[(S_x^2+S_y^2)/(\sigma_x^2+\sigma_y^2)\right]^{1/2}} \tag{28}$$

suggests, but is not, a noncentral t distribution with noncentrality parameter
r. If we were to use a t distribution to fit the distribution of $\hat{r}$, the
radicand $(s_x^2+s_y^2)/(\sigma_x^2+\sigma_y^2)$ should be of the form
$\chi_c^2/c$. Therefore

$$c\,(s_x^2+s_y^2)/(\sigma_x^2+\sigma_y^2) = \chi_c^2 \tag{29}$$

Using the properties $\mathrm{var}(\chi_c^2) = 2c$,
$\mathrm{var}(s_x^2) = 2\sigma_x^4/(n-1)$, $\mathrm{var}(s_y^2) = 2\sigma_y^4/(m-1)$
and taking the variance of both sides of Equation (29), we have

$$c^2\left(2\sigma_x^4/(n-1) + 2\sigma_y^4/(m-1)\right)/(\sigma_x^2+\sigma_y^2)^2 = 2c$$

Solving for c we get

$$c = (\sigma_x^2+\sigma_y^2)^2/\left(\sigma_x^4/(n-1) + \sigma_y^4/(m-1)\right)$$

Finally we let

$$\nu = \hat{c} = (s_x^2+s_y^2)^2/\left(s_x^4/(n-1) + s_y^4/(m-1)\right) \tag{30}$$
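As a quick sanity check on this degrees of freedom expression (a worked special case added for illustration, not taken from the paper), consider equal sample variances and equal sample sizes. The expression then reduces to the pooled degrees of freedom n + m - 2 used in the equal variances case:

```latex
\begin{align*}
\nu\,\Big|_{\,s_x^2 = s_y^2 = s^2,\ n = m}
  &= \frac{(2s^2)^2}{\dfrac{s^4}{n-1} + \dfrac{s^4}{n-1}}
   = \frac{4s^4\,(n-1)}{2s^4}
   = 2(n-1) = n + m - 2 .
\end{align*}
```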
4.  CONCLUSIONS


We have derived closed form expressions for approximate lower
confidence limits for the reliability of a device when reliability is modeled as
P(X>y) or P(X>Y), where X and Y are independent normally distributed
variables with unknown means and variances. The expressions have been
evaluated for accuracy and demonstrated to be quite accurate when sample
sizes are larger than ten. The three confidence limit expressions presented are
easy to compute and can be programmed on some existing hand-held
calculators. Percentile values, $t_{\alpha,n}$, of the standard t distribution and
values of the standard normal cumulative distribution function, $\Phi(z)$, are
required for each expression to compute the lower confidence limit values for
given data sets.

In  mechanical  reliability  settings,  X  usually  denotes  strength  and  Y 
denotes stress. However, both reliability models, P(X>Y) and P(X>y), have
numerous  applications  when  X  and  Y  denote  times  to  first  occurrence  of 
events. 